Avoid recursive blanket impl checks #155765
chenyukang wants to merge 2 commits into rust-lang:main
Conversation
r? @fmease

rustbot has assigned @fmease.
The CI failure is caused by this PR: `impl<T> From<!> for T` when searching. I think it's ok to update the test case.
This can just be a rustdoc UI test since we don't care about the HTML. For that, move it into `tests/rustdoc-ui/`, remove the `#![crate_name]` and the htmldocck directives, and add `//@ check-pass`.
```diff
@@ -0,0 +1,268 @@
#![crate_name = "foo"]
```
In a source code comment at the top, can you add which initial (auto trait, ADT) pairing(s) used to cause the blowup (unless that info isn't meaningful)?
I'm curious, does this have any effect on issue #114891?
```rust
let predicate = infcx.resolve_vars_if_possible(predicate);
if let Some(may_apply) =
    cached_auto_trait_predicate_result(cx, predicate, impl_ty, item_ty)
```
It's very odd to me that we have to special-case auto traits here when synthesizing blanket impls; it feels like we're manually helping out the trait solver when it's likely the old trait solver's fault.
Well, I don't actually know the actual fast-expanding tree of obligations in dtolnay's example. Could you enlighten me?
I'm gonna rebase my dusty PR #125907, impl lcnr's final suggestion and check if that PR also fixes the issue.
OK, so the next solver should be a better fix for it.
> It's very odd to me that we have to special case auto traits here when synthesizing blanket impls, it feels like we're manually helping out the trait solver here when it's likely the old trait solver's fault?
>
> Well, I don't actually know the actually fast-expanding tree of obligations in dtolnay's example. Could you enlighten me?
The expanding obligation tree I had in mind is from checking the blanket impl:
```rust
impl<T: Send + Sync> ThreadSafe for T {}
```
for `Class<Span>`.
To synthesize impl ThreadSafe for Class<Span>, rustdoc needs to prove the impl predicates:
- `Class<Span>: Send`
- `Class<Span>: Sync`
Each of those auto-trait goals expands through the fields of Class:
```text
Class<Span>
-> Arc<[Statement<Span>]>
-> Arc<[Expression<Span>]>
```
Here N means the number of recursive wrapper variants in each enum: the
S00..S(N-1) variants in Statement and the E00..E(N-1) variants in
Expression. In dtolnay's example, N = 60 for each of those two enums.
Then `Statement` and `Expression` each contain two kinds of recursive edges:
- a `Class(Arc<Class<Span>>)` variant, which leads back to `Class<Span>`;
- `N` wrapper variants, each of which leads back to the same enum type.
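That shape can be sketched concretely. The following is a hypothetical miniature of dtolnay's example (names, the `Leaf` variant, and `N = 2` are chosen for brevity; the real example uses `N = 60`):

```rust
use std::sync::Arc;

// Hypothetical miniature (N = 2) of the recursive shape described above.
pub struct Span;

pub struct Class<S> {
    pub statements: Arc<[Statement<S>]>,
    pub expressions: Arc<[Expression<S>]>,
}

pub enum Statement<S> {
    // recursive edge back to Class<S>
    Class(Arc<Class<S>>),
    // N wrapper variants, each recursing into the same enum
    S00(Arc<Statement<S>>),
    S01(Arc<Statement<S>>),
    Leaf(S),
}

pub enum Expression<S> {
    Class(Arc<Class<S>>),
    E00(Arc<Expression<S>>),
    E01(Arc<Expression<S>>),
    Leaf(S),
}

// The blanket impl's predicates force the solver to walk this whole
// recursive shape to prove `Class<Span>: Send + Sync`.
pub fn assert_thread_safe<T: Send + Sync>() {}

pub fn check() {
    assert_thread_safe::<Class<Span>>();
}
```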
So one layer of expansion is already O(N):
```text
Statement<Span>: Send
-> Class variant -> Class<Span>: Send/Sync
-> S00<Span>: Send -> Statement<Span>: Send
-> S01<Span>: Send -> Statement<Span>: Send
...
-> S(N-1)<Span>: Send -> Statement<Span>: Send
```
and similarly for Expression<Span>.
The shape is a recursive cycle with an O(N) fan-out at each layer:
```text
Class
-> Statement / Expression
   -> Class
   -> N wrapper variants
      -> Statement / Expression
         -> Class
         -> N wrapper variants
            -> ...
```
If the same auto-trait goals are re-expanded instead of being reused as part of the recursive cycle, the explored tree grows roughly like:
1 + O(N) + O(N^2) + O(N^3) + ...
In dtolnay's example, that matches the observed behavior: adding 10 more variants roughly doubled the rustdoc runtime. So the practical behavior is exponential-like in the number of variants.
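The difference can be modeled with a toy counter (hypothetical names; `children` mimics the fan-out above): re-expanding every goal counts one node per expansion, while a visited set caps the count at the number of distinct goals.

```rust
use std::collections::HashSet;

// Toy model of the obligation graph above: "Class" expands to the two
// enums, and each enum expands back to "Class" plus `n` wrapper variants
// that recurse into the same enum.
fn children(node: &str, n: usize) -> Vec<String> {
    match node {
        "Class" => vec!["Stmt".to_string(), "Expr".to_string()],
        "Stmt" | "Expr" => {
            let mut c = vec!["Class".to_string()];
            for i in 0..n {
                c.push(format!("{}::W{}", node, i)); // wrapper variant
            }
            c
        }
        // a wrapper variant leads back to its enum
        w => vec![w.split("::").next().unwrap().to_string()],
    }
}

// Nodes explored when every goal is re-expanded, cut off after `depth` layers.
fn explore(node: &str, n: usize, depth: usize) -> u64 {
    if depth == 0 {
        return 1;
    }
    1 + children(node, n).iter().map(|c| explore(c, n, depth - 1)).sum::<u64>()
}

// Nodes explored when already-seen goals are reused instead of re-expanded.
fn explore_cached(node: &str, n: usize, seen: &mut HashSet<String>) -> u64 {
    if !seen.insert(node.to_string()) {
        return 0; // goal already answered; reuse it
    }
    1 + children(node, n).iter().map(|c| explore_cached(c, n, seen)).sum::<u64>()
}
```

With reuse, the count is exactly the `3 + 2N` distinct goals; without it, each extra layer multiplies the work, giving the `1 + O(N) + O(N^2) + ...` growth.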
```rust
pub(crate) generated_synthetics: FxHashSet<(Ty<'tcx>, DefId)>,
/// Polarity of synthesized auto-trait impls processed so far.
pub(crate) generated_auto_trait_impls: FxHashMap<(Ty<'tcx>, DefId), ty::ImplPolarity>,
```
Don't try to address this now since I want to test my other PR first. I'd just like to note that it's confusing to name this `generated_auto_trait_impls`, because `generated_synthetics` directly above caches both synthetic blanket and synthetic auto-trait impls (which is actually problematic, they should be separate caches, see #148980), meaning we would now have "two" auto trait impl caches which are slightly different.
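For illustration, the split suggested in #148980 might look like the following sketch. The type names here are hypothetical stand-ins for rustc's `Ty<'tcx>` and `DefId`, not the actual rustdoc code:

```rust
use std::collections::{HashMap, HashSet};

// Stand-ins for rustc's interned types, for illustration only.
type TyKey = String;
type TraitId = u32;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum ImplPolarity {
    Positive,
    Negative,
}

// Separate caches per kind of synthetic impl, instead of one shared
// `generated_synthetics` set covering both.
#[derive(Default)]
pub struct SynthCaches {
    pub generated_blanket_impls: HashSet<(TyKey, TraitId)>,
    pub generated_auto_trait_impls: HashMap<(TyKey, TraitId), ImplPolarity>,
}

impl SynthCaches {
    // Returns true the first time a (type, trait) pair is recorded.
    pub fn record_auto_trait(&mut self, ty: TyKey, tr: TraitId, pol: ImplPolarity) -> bool {
        self.generated_auto_trait_impls.insert((ty, tr), pol).is_none()
    }
}
```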
```diff
@@ -4,6 +4,7 @@ const EXPECTED = [
{
    'query': '! ->',
```
I need to ponder this behavior change; obviously optimizations like caching shouldn't observably affect the semantics of a program, and if they do, that indicates a bug somewhere even if the change is 'desirable'.
I've just woken up and I've only skimmed this PR, so this is just a hunch: I see you're somehow involving `ImplPolarity` in your cache, and the impl in question is `impl<T> From<!> for T`, which is a reservation impl (`ImplPolarity::Reservation`), so you might not be handling that 'correctly'.
Again, it's possible that the change is desirable (I haven't thought about that yet), but then that should be explicitly encoded outside of the caching logic.
```rust
    E58(std::sync::Arc<E58<Span>>),
    E59(std::sync::Arc<E59<Span>>),
}
}
```
If we want to assert that the rustdoc test must finish in a specific time, do we need to add a `run-make` test instead?
That seems iffy to me as it can easily render the test flaky. It could lead to the test failing on slow machines or in CI if the OS is under heavy workload and suspends the process for a while.
I didn't forget about this, I'll revisit it soon. Having such a targeted fix seems a bit iffy, but OTOH the next-solver migration might be blocked on next-solver perf improvements (need to double-check if there's anything else I can do instead of that). I'm curious.

@bors try @rust-timer queue
…impl, r=<try> Avoid recursive blanket impl checks
Finished benchmarking commit (1260cee): comparison URL.

Overall result: ❌✅ regressions and improvements - please read:

Benchmarking means the PR may be perf-sensitive. It's automatically marked not fit for rolling up. Overriding is possible but disadvised: it risks changing compiler perf. Next, please: if you can, justify the regressions found in this try perf run in writing, along with `@bors rollup=never`.

Instruction count: our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

Max RSS (memory usage): results (primary -2.8%, secondary -0.5%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Cycles: results (primary -2.5%, secondary 0.2%). A less reliable metric. May be of interest, but not used to determine the overall result above.

Binary size: this perf run didn't have relevant results for this metric.

Bootstrap: 508.875s -> 500.388s (-1.67%)
Fixes #155759
Avoid recursively re-checking rustdoc blanket impl candidates while synthesizing impls.
Also record the polarity of synthesized auto-trait impls and reuse that result when checking blanket impl predicates.
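The reuse step can be sketched roughly as follows. This is a hypothetical model with `String`/`u32` standing in for `Ty<'tcx>`/`DefId`, mirroring the shape of the `cached_auto_trait_predicate_result` call in the diff above, not the actual implementation:

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Polarity {
    Positive,
    Negative,
}

// Cache of auto-trait results keyed by (type, trait), with stand-in keys.
pub struct AutoTraitCache {
    pub results: HashMap<(String, u32), Polarity>,
}

impl AutoTraitCache {
    // Some(true)/Some(false) if we already synthesized an auto-trait impl
    // for this pair; None on a miss, falling back to full selection.
    pub fn cached_may_apply(&self, ty: &str, trait_id: u32) -> Option<bool> {
        self.results
            .get(&(ty.to_string(), trait_id))
            .map(|p| *p == Polarity::Positive)
    }
}
```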
> If we want to assert that the rustdoc test must finish in a specific time, do we need to add a `run-make` test instead?

I tested it; it now runs in about 2.2s for 60 variants on my local machine.